Abstract

We propose a new measure of quantum entanglement. Our measure is defined in 
terms of conditional information transmission for a Quantum Bayesian Net. We 
show that our measure is identically equal to the Entanglement of Formation in the 
case of a bipartite (two listener) system occupying a pure state. In the case of mixed 
states, the relationship between these two measures is not known yet. We discuss 
some properties of our measure. Our measure can be easily and naturally generalized 
to handle n-partite (n-listener) systems. It is non-negative for any n. It vanishes for 
conditionally separable states with n listeners. It is symmetric under permutations 
of the n listeners. It decreases if listeners are merged, pruned or removed. Most 
promising of all, it is intimately connected with the Data Processing Inequalities. We 
also find a new upper bound for classical mutual information which is of interest in 
its own right. 
1 Introduction



Quantum entanglement is at the very heart of Quantum Mechanics, so there is a vast
amount of literature on the subject. Of particular interest to workers in the field of
Quantum Information Theory are the issues of quantification and manipulation of
entanglement. An important step in that direction was taken in Refs.[1]-[3]. These
references introduced measures of entanglement called entanglement of formation and
of distillation. Since Refs.[1]-[3], the implications of these two measures have been
explored and clarified considerably by many workers[1]. And yet, the quantification
of entanglement for mixed states and for more than two listeners is still not well
understood.

The goal of this paper is to shed some light on the quantification of entangle- 
ment by approaching it from a new perspective, that of Quantum Bayesian Nets and 
conditional information transmission. For a review of Quantum Information Theory 
from the point of view of quantum Bayesian nets, see Ref. [5]. Henceforth, we will 
assume that the reader is familiar with the notation of Ref. [5]. 




Figure 1: CB net in which a and b are conditionally independent. 



For motivation, consider the CB net of Fig.1. This net satisfies

$$P(a,b,\lambda) = P(a|\lambda)\,P(b|\lambda)\,P(\lambda) . \tag{1.1}$$

Summing the last equation over $\lambda$, one gets

$$P(a,b) = \sum_{\lambda} P(a|\lambda)\,P(b|\lambda)\,P(\lambda) . \tag{1.2}$$

One says that $a$ and $b$ are conditionally independent. Eq.(1.2) is often used as the
starting point in the derivation of Bell Inequalities [6]. In that context, $\lambda$ represents
the hidden variables. As shown in Ref.[5], Eq.(1.1) implies

$$H[(a:b)|\lambda] = 0 . \tag{1.3}$$

As we shall see in what follows, $S_\rho[(a:b)|\lambda]$, the quantum mechanical counterpart
of $H[(a:b)|\lambda]$, is NOT generally zero for a QB net with the graph of Fig.1. Thus,
$S_\rho[(a:b)|\lambda]$ appears to be a good measure of quantum entanglement, which is a
phenomenon that does not occur classically. This paper is devoted to discussing
$S_\rho[(a:b)|\lambda]$ and its generalizations.
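The classical statement of Eq.(1.3) can be checked numerically. The following Python sketch builds a joint distribution of the product form of Eq.(1.1) and evaluates the conditional mutual information from its entropies. The probability tables `p_lam`, `p_a`, `p_b` and all function names are illustrative assumptions, not taken from the paper.

```python
from math import log2
from itertools import product

def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint dict {outcome tuple: prob} on the positions in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_mutual_info(joint):
    """H[(a:b)|lambda] for a joint over outcomes ordered as (a, b, lambda)."""
    return (entropy(marginal(joint, (0, 2))) + entropy(marginal(joint, (1, 2)))
            - entropy(joint) - entropy(marginal(joint, (2,))))

# Hypothetical numbers: P(lambda), P(a|lambda) as p_a[lam][a], P(b|lambda) as p_b[lam][b].
p_lam = {0: 0.3, 1: 0.7}
p_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_b = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.6, 1: 0.4}}

# Eq.(1.1): P(a,b,lambda) = P(a|lambda) P(b|lambda) P(lambda)
joint = {(a, b, lam): p_a[lam][a] * p_b[lam][b] * p_lam[lam]
         for a, b, lam in product((0, 1), repeat=3)}

cmi = cond_mutual_info(joint)
# Unconditional mutual information H(a:b), for comparison.
mi_ab = (entropy(marginal(joint, (0,))) + entropy(marginal(joint, (1,)))
         - entropy(marginal(joint, (0, 1))))
print(cmi, mi_ab)
```

With these numbers the conditional mutual information vanishes up to rounding, while the unconditional mutual information of $a$ and $b$ is positive, since both depend on $\lambda$.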

2 Entanglement of Formation 

In this section, we will give a very brief review of the most basic aspects of the 
Entanglement of Formation. 

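Consider two Hilbert spaces $\mathcal{H}_x$ and $\mathcal{H}_y$ which need not have the same
dimension. Without loss of generality, we will assume that the dimension $N_x$ of $\mathcal{H}_x$
is less than or equal to the dimension $N_y$ of $\mathcal{H}_y$.

The entanglement of formation $E_F$ for a bipartite pure state $|\psi\rangle \in \mathcal{H}_x \otimes \mathcal{H}_y$ is
defined by

$$E_F(|\psi\rangle) = S[{\rm tr}_y(|\psi\rangle\langle\psi|)] . \tag{2.1}$$

Consider any density matrix $\rho$. If $\mathcal{E} = \{(w_\alpha, |\psi_\alpha\rangle) \,|\, \forall\alpha\}$ satisfies

$$\rho = \sum_\alpha w_\alpha |\psi_\alpha\rangle\langle\psi_\alpha| , \tag{2.2}$$

then we say $\mathcal{E}$ is a $\rho$-ensemble. (This clearly defines an equivalence relationship.)
Ref.[7] characterizes all $\mathcal{E}$ belonging to a given $\rho$. The entanglement of formation $E_F$
for a bipartite mixed state with density matrix $\rho$ acting on $\mathcal{H}_x \otimes \mathcal{H}_y$ is defined by

$$E_F(\rho) = \min_{\mathcal{E}} \sum_\alpha w_\alpha E_F(|\psi_\alpha\rangle) , \tag{2.3}$$

where the minimum is taken over all ensembles $\mathcal{E} = \{(w_\alpha, |\psi_\alpha\rangle) \,|\, \forall\alpha\}$ which are
$\rho$-ensembles.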

First, let us consider $E_F$ for pure states. Let $\psi$ be the rectangular matrix with
entries $\psi_{xy} = \langle x,y|\psi\rangle$. We will often denote $E_F(|\psi\rangle)$ by $E_F(\psi)$ or $E_F(\psi_{xy})$. Thus,

$$E_F(\psi_{xy}) = S(\psi\psi^\dagger) . \tag{2.4}$$

There always exist unitary matrices $U$ and $V$ such that

$$U\psi V^\dagger = \tilde{\psi} , \tag{2.5}$$

where the rectangular matrix $\tilde{\psi}$ is "diagonal", in the sense that $\tilde{\psi}_{xy} = 0$ if $x \neq y$.
Eq.(2.5) is called the Singular Value Decomposition [8] of $\psi$. Define $p_i$ for
$0 \leq i \leq N_x - 1$ by

$$U\psi\psi^\dagger U^\dagger = \tilde{\psi}\tilde{\psi}^\dagger = {\rm diag}(p_0, p_1, \ldots, p_{N_x-1}) . \tag{2.6}$$

Since $\langle a|\psi\psi^\dagger|a\rangle \geq 0$ for any $|a\rangle \in \mathcal{H}_x$, the $p_i$'s are non-negative numbers.
Furthermore, since ${\rm tr}(U\psi\psi^\dagger U^\dagger) = \sum_{x,y} |\psi_{xy}|^2 = 1$, the $p_i$'s add up to one. Note that

$$|\psi\rangle = \sum_{x,y} \psi_{xy} |x,y\rangle , \tag{2.7}$$

$$|\tilde{\psi}\rangle = \sum_{x} \sqrt{p_x}\, |\underline{x} = x, \underline{y} = x\rangle . \tag{2.8}$$

Eq.(2.8) is called the Schmidt Representation [7] of $|\psi\rangle$. It follows directly from the
Singular Value Decomposition of $\psi$. By Eq.(2.4) and (2.6),

$$E_F(\psi_{xy}) = -\sum_{i=0}^{N_x-1} p_i \log_2 p_i . \tag{2.9}$$

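For the remainder of this section, we will restrict our attention to the special
case where $x$ and $y$ have just two states, 0 and 1. In this case, $E_F(\psi_{xy}) = h(p_0)$, where
$h$ is the binary entropy function, and where $p_0$ and $p_1 = 1 - p_0$ are the eigenvalues of
$\psi\psi^\dagger$. Define complex numbers $K_0$, $K_1$ and $K$ by

$$\psi\psi^\dagger = \begin{pmatrix} K_0 & K \\ K^* & K_1 \end{pmatrix} . \tag{2.10}$$

Thus,

$$K_0 = |\psi_{00}|^2 + |\psi_{01}|^2 , \tag{2.11a}$$

$$K_1 = |\psi_{10}|^2 + |\psi_{11}|^2 , \tag{2.11b}$$

$$K = \psi_{00}\psi_{10}^* + \psi_{01}\psi_{11}^* . \tag{2.11c}$$

The two eigenvalues of $\psi\psi^\dagger$ are

$$p_0 = \frac{1 + \sqrt{1-t}}{2} , \quad p_1 = 1 - p_0 , \tag{2.12a}$$

where

$$t = 4(K_0 K_1 - |K|^2) = 4|\psi_{00}\psi_{11} - \psi_{01}\psi_{10}|^2 . \tag{2.12b}$$

The Bell Basis is defined by

$$|B_f\rangle = \frac{1}{\sqrt{2}}\left(|0, f_0\rangle + (-1)^{f_1}|1, \bar{f}_0\rangle\right) , \tag{2.13}$$

for $f = (f_0, f_1) \in Bool^2$. ($\bar{0} = 1$ and $\bar{1} = 0$.) If $x, y \in Bool$, then

$$\langle x,y|B_f\rangle = \frac{1}{\sqrt{2}} (-1)^{x f_1}\, \delta(y, x \oplus f_0) , \tag{2.14}$$

where $\oplus$ denotes addition modulo 2. Let $a_j$ for $j \in Z_{0,3}$ be the components of $|\psi\rangle$ in the Bell Basis:

$$|\psi\rangle = a_0|B_{00}\rangle + i a_1|B_{01}\rangle + i a_2|B_{10}\rangle + a_3|B_{11}\rangle . \tag{2.15}$$

Then

$$\psi_{00} = \frac{1}{\sqrt{2}}(a_0 + i a_1) , \tag{2.16}$$

$$\psi_{01} = \frac{1}{\sqrt{2}}(i a_2 + a_3) , \tag{2.17}$$

$$\psi_{10} = \frac{1}{\sqrt{2}}(i a_2 - a_3) , \tag{2.18}$$

$$\psi_{11} = \frac{1}{\sqrt{2}}(a_0 - i a_1) . \tag{2.19}$$

Substituting these equations into the definition Eq.(2.12b) of $t$ yields

$$t = \left|\sum_{j=0}^{3} a_j^2\right|^2 . \tag{2.20}$$

Suppose that $q_j = |a_j|^2$ and $\theta_j = {\rm phase}(a_j^2)$ for $j \in Z_{0,3}$. Then $\sum_{j=0}^{3} q_j = 1$ and
$t = |\sum_{j=0}^{3} q_j e^{i\theta_j}|^2$. Thus $0 \leq t \leq 1$, and $t = 1$ iff the $\theta_j$'s with $q_j \neq 0$ are all equal
(i.e., up to an overall phase, the $a_j$'s are all real). Also, $t = 0$ iff $E_F(\psi_{xy}) = 0$, and $t = 1$ iff
$E_F(\psi_{xy})$ is maximum. This is because, as is clear from Fig.2, $h(p_0(t)) = E_F(\psi_{xy})$ is a
monotonically increasing function of $t$ which goes from 0 to 1 as $t$ goes from 0 to 1.

Figure 2: Plot of functions $p_0(t)$ and $h(p_0)$.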



So far we have discussed $E_F$ for pure states. There are still many unsolved
mysteries about $E_F$ for mixed states. An example for which definition Eq.(2.3) has
been evaluated is when $\rho$ is diagonal in the Bell basis:

$$\rho = \sum_\alpha w_\alpha |B_\alpha\rangle\langle B_\alpha| , \tag{2.21}$$

where the $w_\alpha$'s are non-negative numbers that add up to one. Ref.[3] shows that for
this $\rho$,

$$E_F(\rho) = \begin{cases} 0 & \text{if } W < \frac{1}{2} \\ h\left(\frac{1}{2} + \sqrt{W(1-W)}\right) & \text{if } W \geq \frac{1}{2} \end{cases} , \tag{2.22}$$

where

$$W = \max_\alpha (w_\alpha) . \tag{2.23}$$
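The two closed forms reviewed in this section are easy to evaluate. The Python sketch below computes $E_F$ for a normalized two-qubit pure state through $t$ of Eq.(2.12b), and for a Bell-diagonal mixture through Eq.(2.22). The function names and test states are illustrative assumptions. The sign chosen for $p_0$ is also an assumption, but it does not affect the result because $h(p_0) = h(1 - p_0)$.

```python
from math import log2, sqrt

def h(p):
    """Binary entropy function, in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def ef_pure(psi):
    """E_F of a normalized two-qubit pure state; psi[x][y] are its amplitudes."""
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    t = 4 * abs(det) ** 2            # Eq.(2.12b)
    t = min(max(t, 0.0), 1.0)        # guard against floating-point overshoot
    p0 = (1 + sqrt(1 - t)) / 2       # Eq.(2.12a)
    return h(p0)

def ef_bell_diagonal(w):
    """E_F of a Bell-diagonal mixture with weights w (a sequence of 4 numbers), Eq.(2.22)."""
    W = max(w)
    if W <= 0.5:
        return 0.0
    return h(0.5 + sqrt(W * (1 - W)))

s = 1 / sqrt(2)
ef_product = ef_pure([[1.0, 0.0], [0.0, 0.0]])          # |00>
ef_bell = ef_pure([[s, 0.0], [0.0, s]])                 # (|00> + |11>)/sqrt(2)
ef_mix_pure = ef_bell_diagonal([1.0, 0.0, 0.0, 0.0])    # a single Bell state
ef_mix_flat = ef_bell_diagonal([0.25, 0.25, 0.25, 0.25])
print(ef_product, ef_bell, ef_mix_pure, ef_mix_flat)
```

A product state gives $t = 0$ and $E_F = 0$, a Bell state gives $t = 1$ and $E_F = 1$, and the two formulas agree when the Bell-diagonal mixture reduces to a single Bell state.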



3 Some Definitions 

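In this section, we will define our measure of entanglement. Future sections will
explore the properties of our measure, and how it compares with $E_F$.

Consider the nodes of either a QB or a CB net. Suppose that $\underline{L}_1, \underline{L}_2, \ldots, \underline{L}_n$
and $\underline{E}$ are non-empty disjoint node collections of the net. For a CB net, we define
the H-tanglement $HT$ for $n$ listeners (or receivers) $\underline{L}_1, \underline{L}_2, \ldots, \underline{L}_n$ and speaker (or
sender) $\underline{E}$ by

$$HT(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n | \underline{E}) = \sum_{i=1}^{n} H(\underline{L}_i|\underline{E}) - H(\underline{L}_1, \underline{L}_2, \ldots, \underline{L}_n|\underline{E}) . \tag{3.1}$$

Analogously, for a QB net we define the S-tanglement $ST$ by

$$ST_\rho(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n | \underline{E}) = \sum_{i=1}^{n} S_\rho(\underline{L}_i|\underline{E}) - S_\rho(\underline{L}_1, \underline{L}_2, \ldots, \underline{L}_n|\underline{E}) . \tag{3.2}$$

Here $\rho$ is any density matrix obtained by reducing the meta density matrix of the
net, but such that the nodes in $\underline{L}_1, \underline{L}_2, \ldots, \underline{L}_n$ and $\underline{E}$ haven't been reduced. We will
also use the term max S-tanglement to refer to $ST$ maximized over all local unitary
operations on the $\underline{L}_i$'s. If $ST_\rho \neq 0$ for a QB net but $HT = 0$ for its parent CB
net, we will describe this situation by saying that there is non-classical tanglement.
$H(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n)$ (or $S(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n)$) will be called an H (or S) mutual
information for $n$ parts. $H(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n|\underline{E})$ (or $S(\underline{L}_1 : \underline{L}_2 : \ldots : \underline{L}_n|\underline{E})$) will
be called an H (or S) conditional mutual information (c.m.i.) for $n$ listeners. When
there are two listeners, tanglement equals a c.m.i. As we shall see later, this is no
longer the case for more than two listeners.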
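For a CB net, the definition Eq.(3.1) translates directly into code. The Python sketch below evaluates $HT$ from a joint probability table, using $H(A|E) = H(A,E) - H(E)$. The representation of the distribution (a dict keyed by outcome tuples, with listeners and speaker given as index positions) and the two example distributions are assumptions made for illustration.

```python
from math import log2

def H(joint, keep):
    """Shannon entropy (bits) of the marginal of `joint` on the index positions `keep`."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def h_tanglement(joint, listeners, speaker):
    """HT(L_1 : ... : L_n | E) of Eq.(3.1).

    joint     -- dict mapping outcome tuples to probabilities
    listeners -- list of tuples of index positions, one tuple per listener
    speaker   -- tuple of index positions forming the speaker E
    """
    speaker = tuple(speaker)
    h_e = H(joint, speaker)

    def cond(nodes):
        # H(nodes | E) = H(nodes, E) - H(E)
        return H(joint, tuple(nodes) + speaker) - h_e

    everyone = [i for listener in listeners for i in listener]
    return sum(cond(listener) for listener in listeners) - cond(everyone)

# Outcomes are ordered (x1, x2, x3, e); the numbers are illustrative only.
copies_of_e = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}   # every listener copies a fair bit e
copies_no_e = {(0, 0, 0, 0): 0.5, (1, 1, 1, 0): 0.5}   # same listeners, constant speaker

ht_given_e = h_tanglement(copies_of_e, [(0,), (1,), (2,)], (3,))
ht_blind = h_tanglement(copies_no_e, [(0,), (1,), (2,)], (3,))
print(ht_given_e, ht_blind)
```

In the first example the speaker explains all the correlation among the listeners, so $HT = 0$. In the second the speaker carries no information, and $HT = 3 \cdot 1 - 1 = 2$ bits.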

Recall from Ref. [5] that a node collection with more than one node is said to 
be compound. Likewise, a listener or speaker with more than one node will be said 
to be compound. 

Suppose that $\underline{X}$ and $\underline{Y}$ are non-empty disjoint node collections of either a
CB or a QB net. For a CB net, we will say that $\underline{X}$ and $\underline{Y}$ are (probabilistically)
independent (also called separable or uncorrelated) if

$$P(X,Y) = P(X)P(Y) , \tag{3.3}$$

for all possible $X$ and $Y$. For a QB net, suppose $\rho_{X,Y}$ is a density matrix acting on
$\mathcal{H}_{X,Y}$ and obtained by reducing the meta density matrix of the net. We will say that
$\underline{X}$ and $\underline{Y}$ are independent (or separable) if

$$\rho_{X,Y} = \rho_X \rho_Y . \tag{3.4}$$

Suppose that $\underline{X}$, $\underline{Y}$ and $\underline{E}$ are non-empty disjoint node collections of either a
CB or a QB net. For a CB net, we will say that $\underline{X}$ and $\underline{Y}$ are conditionally independent
(or conditionally separable) if

$$P(X,Y) = \sum_{E} P(X|E)P(Y|E)P(E) , \tag{3.5}$$

for all possible $X$ and $Y$. For a QB net, suppose $\rho_{X,Y,E}$ is a density matrix acting
on $\mathcal{H}_{X,Y,E}$ and obtained by reducing the meta density matrix of the net. We will say
that $\underline{X}$ and $\underline{Y}$ are conditionally independent (or conditionally separable) if

$$\rho_{X,Y,E} = \sum_{E} w_E\, \rho_X^E \rho_Y^E\, |E\rangle\langle E| , \tag{3.6}$$

where $\{|E\rangle \,|\, \forall E\}$ is an orthonormal basis corresponding to $\underline{E}$, $w_E \geq 0$ for all $E$,
$\sum_E w_E = 1$, $\rho_X^E$ acts on $\mathcal{H}_X$, and $\rho_Y^E$ acts on $\mathcal{H}_Y$.

4 ST for 2 Single-node Listeners and a Pure State 

In this section, we will discuss S-tanglement for 2 single-node listeners and a pure
state. We will show that it equals $E_F$ if we maximize it over all local unitary
transformations on the two listeners.




Figure 3: Net for 2 single-node listeners and a pure state. 



Consider the QB net of Fig. 3, where

nodes | states | amplitudes | comments
----- | ------ | ---------- | --------
$\underline{e}$ | $e = (e_1, e_2)$ | $\psi(e)$, where $\psi = U\psi^0 V^T$ | $\sum_{x,y} |\psi(x,y)|^2 = 1$, $\sum_a U^*_{ax}U_{ax'} = \delta_x^{x'}$, $\sum_b V^*_{by}V_{by'} = \delta_y^{y'}$
$\underline{x}$ | $x \in S_x$ | $\delta(x, e_1)$ |
$\underline{y}$ | $y \in S_y$ | $\delta(y, e_2)$ |

We will sometimes write $\psi_{xy}$ instead of $\psi(x,y)$. Without loss of generality, we will
assume that $N_x$ (the size of set $S_x$) is less than or equal to $N_y$.
The meta density matrix $\mu$ of this net is

$$\mu = |\psi_{meta}\rangle\langle\psi_{meta}| , \tag{4.1}$$

where

$$|\psi_{meta}\rangle = \sum_{x,y} \psi(x,y)\, |e = (x,y), x, y\rangle . \tag{4.2}$$

Define $\rho$ by

$$\rho = {\rm tr}_e(\mu) = \sum_{x,y} \psi(x,y)\psi^*(x,y)\, |x,y\rangle\langle x,y| . \tag{4.3}$$

One has that

$$S_\mu(x : y|e) = S_\mu(x,e) + S_\mu(y,e) - S_\mu(x,y,e) - S_\mu(e) . \tag{4.4}$$

But $\mu$ is a pure state acting on $\mathcal{H}_{x,y,e}$, so

$$S_\mu(x,e) = S_\mu(y) , \tag{4.5a}$$

$$S_\mu(y,e) = S_\mu(x) , \tag{4.5b}$$

$$S_\mu(x,y,e) = 0 , \tag{4.5c}$$

$$S_\mu(e) = S_\mu(x,y) . \tag{4.5d}$$

Substituting Eqs.(4.5) into Eq.(4.4) yields

$$S_\mu(x : y|e) = S_\mu(x : y) = S_\rho(x : y) . \tag{4.6}$$

Note that $\rho$ is diagonal in the $|x,y\rangle$ basis so Eq.(4.6) can be simplified further. Let

$$P(x,y) = |\psi(x,y)|^2 . \tag{4.7}$$

With this $P(x,y)$, one can calculate $H(x : y)$. Eq.(4.6) reduces to

$$S_\mu(x : y|e) = H(x : y) . \tag{4.8}$$

Henceforth, we will often abbreviate $P(x,y)$ by $P_{xy}$, $P(x) = \sum_y P(x,y)$ by
$P_{x\cdot}$, and $P(y) = \sum_x P(x,y)$ by $P_{\cdot y}$.

When $S_x = S_y = Bool$, the unitary matrices $U$ and $V$ mentioned in the above
table determine what spin direction is measured at the nodes $\underline{x}$ and $\underline{y}$. The above
table and the following one

nodes | states | amplitudes | comments
----- | ------ | ---------- | --------
$\underline{e}$ | $e = (e_1, e_2)$ | $\psi^0(e)$ |
$\underline{x}$ | $x \in S_x$ | $U_{x, e_1}$ |
$\underline{y}$ | $y \in S_y$ | $V_{y, e_2}$ |

do not yield the same $S_\mu(x : y|e)$. In the first table, node $\underline{e}$ upon which we condition
has knowledge of $U$ and $V$, whereas in the second it doesn't. We will call the $U$
and $V$ in the first (ditto, second) table a priori (ditto, a posteriori) local unitary
transformations on $\underline{x}$ and $\underline{y}$. In this section, we are interested in the case of the first
table, where $U$ and $V$ refer to a priori transformations.

Suppose $\psi$ (ditto, $\psi^0$) is the rectangular matrix with entries $\psi_{xy}$ (ditto, $\psi^0_{xy}$).
Then

$$\psi = U\psi^0 V^T . \tag{4.9}$$

Let us consider the special case that $U$ and $V$ make $\psi$ diagonal. Such a $U$ and $V$
always exist by the Singular Value Decomposition Theorem. Suppose that

$$\psi\psi^\dagger = {\rm diag}(p_0, p_1, \ldots, p_{N_x-1}) . \tag{4.10}$$

The $p_x$'s must be non-negative numbers that add up to one. Then $P_{xy} = p_x \delta_x^y$, so

$$H(x : y) = \sum_{x,y} P_{xy} \log_2 \frac{P_{xy}}{P_{x\cdot}P_{\cdot y}} = \sum_{x} p_x \log_2 \frac{1}{p_x} = E_F(\psi_{xy}) = E_F(\psi^0_{xy}) . \tag{4.11}$$

Combining the last equation and Eq.(4.8) yields

$$S_\mu(x : y|e) = E_F(\psi_{xy}) \tag{4.12}$$

for the special case that $U$ and $V$ make $\psi$ diagonal.

In Appendices A and B, we show the following inequalities:

$$E_F(|\psi_{xy}|) \leq E_F(\psi_{xy}) , \tag{4.13}$$

$$H(x : y) \leq E_F(|\psi_{xy}|) . \tag{4.14}$$

Combining these inequalities and Eq.(4.8) yields

$$S_\mu(x : y|e) \leq E_F(\psi_{xy}) . \tag{4.15}$$

From the argument leading up to Eq.(4.12), we see that there exists a pair
of unitary matrices $U$ and $V$ so that the S-tanglement $ST$ equals the corresponding
entanglement of formation $E_F$. From the argument leading up to Eq.(4.15), we see
that for any $U$ and $V$, $ST$ is less than or equal to the corresponding $E_F$. Therefore,
if $ST$ is maximized over all a priori local unitary transformations $U$ and $V$ on its two
listeners, then it equals $E_F$.
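The role of the maximization can be seen in a small numerical experiment. The Python sketch below, written for two qubits, compares $H(x : y)$ computed from $P(x,y) = |\psi(x,y)|^2$ with $E_F(\psi_{xy})$ computed from the determinant formula of Section 2. The three test states and all function names are illustrative assumptions. The states are a Schmidt-form Bell state, the same amount of entanglement written in a different local basis, and a generic state.

```python
from math import log2, sqrt

def h(p):
    """Binary entropy function, in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def ef_pure(psi):
    """E_F of a normalized two-qubit pure state with amplitudes psi[x][y]."""
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    t = min(max(4 * abs(det) ** 2, 0.0), 1.0)
    return h((1 + sqrt(1 - t)) / 2)

def classical_mutual_info(psi):
    """H(x:y) for P(x,y) = |psi(x,y)|^2, as in Eqs.(4.7) and (4.8)."""
    P = [[abs(psi[x][y]) ** 2 for y in range(2)] for x in range(2)]
    Px = [P[x][0] + P[x][1] for x in range(2)]
    Py = [P[0][y] + P[1][y] for y in range(2)]
    return sum(P[x][y] * log2(P[x][y] / (Px[x] * Py[y]))
               for x in range(2) for y in range(2) if P[x][y] > 0)

s = 1 / sqrt(2)
schmidt = [[s, 0.0], [0.0, s]]            # psi is already diagonal
rotated = [[0.5, 0.5], [0.5, -0.5]]       # maximally entangled, but P(x,y) is uniform
norm = sqrt(0.8 ** 2 + 0.1 ** 2 + 0.2 ** 2 + 0.5 ** 2)
generic = [[0.8 / norm, 0.1 / norm], [0.2 / norm, 0.5 / norm]]

results = {name: (classical_mutual_info(psi), ef_pure(psi))
           for name, psi in (("schmidt", schmidt), ("rotated", rotated), ("generic", generic))}
for name, (mi, ef) in results.items():
    print(name, round(mi, 4), round(ef, 4))
```

For the Schmidt-form state the two quantities coincide, as in Eq.(4.12). For the rotated state $H(x : y) = 0$ although $E_F = 1$, and for the generic state $H(x : y)$ lies strictly below $E_F$, consistent with Eq.(4.15).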



5 ST for 2 Single-Node Listeners and a Mixed State 

In this section, we will discuss S-tanglement for 2 single-node listeners and a mixed 
state. We will show that it vanishes for a conditionally separable state. We will also 
calculate ST for any p which is diagonal in the Bell basis. 
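Suppose $\underline{q}_1, \underline{q}_2, \underline{\alpha}$ are nodes of a QB net. Suppose

$$\rho = \sum_\alpha w_\alpha\, \rho_1^\alpha \rho_2^\alpha , \tag{5.1}$$

where $w_\alpha \geq 0$ for all $\alpha$ and $\sum_\alpha w_\alpha = 1$, and where for all $\alpha$ and for $\lambda = 1, 2$, $\rho_\lambda^\alpha$ is a
density matrix acting on $\mathcal{H}_{q_\lambda}$. For such a $\rho$, $E_F(\rho) = 0$ [3]. To calculate $S_\rho(q_1 : q_2|\alpha)$,
we need a $\rho$ that acts on a space $\mathcal{H}_{q_1,q_2,\alpha}$ or larger, so the $\rho$ in Eq.(5.1) will not do.
Suppose we consider instead the following $\rho$:

$$\rho = \sum_\alpha w_\alpha\, |\alpha\rangle\langle\alpha|\, \rho_1^\alpha \rho_2^\alpha , \tag{5.2}$$

where $\{|\alpha\rangle \,|\, \forall\alpha\}$ is the orthonormal basis for node $\underline{\alpha}$. For this $\rho$, one has

$$S_\rho(q_1 : q_2|\alpha) = S_\rho(q_1,\alpha) + S_\rho(q_2,\alpha) - S_\rho(q_1,q_2,\alpha) - S_\rho(\alpha) , \tag{5.3}$$

where

$$S_\rho(q_\lambda,\alpha) = H(\vec{w}) + \sum_\alpha w_\alpha S(\rho_\lambda^\alpha) \quad \text{for } \lambda = 1, 2 , \tag{5.4}$$

$$S_\rho(q_1,q_2,\alpha) = H(\vec{w}) + \sum_\alpha w_\alpha \{S(\rho_1^\alpha) + S(\rho_2^\alpha)\} , \tag{5.5}$$

$$S_\rho(\alpha) = H(\vec{w}) , \tag{5.6}$$

so

$$S_\rho(q_1 : q_2|\alpha) = 0 . \tag{5.7}$$

Figure 4: Net that implements a 2 choice conditionally separable density matrix.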

Note that the $\rho$ defined by Eq.(5.2) can be implemented by the QB net of
Fig. 4, where



nodes 


states 


amplitudes 


comments 


i 


J = (?,?) 




Eji = 1 




a 






L 


f 


5(f,?) 




for A e Zi,2 




o:\ij 


E,J«AOA|a)p = l 


for A e Zi^2 








Zx for Xe Zi^2 


r\ 







The meta density matrix $\mu$ of this net is

$$\mu = |\psi_{meta}\rangle\langle\psi_{meta}| , \tag{5.8}$$

where

Define p by 



.A=l 



\j = {a,a),a,r = a) . (5.9) 
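Then

$$\rho = \sum_\alpha w_\alpha\, |\alpha\rangle\langle\alpha|\, \rho_1^\alpha \rho_2^\alpha ,$$

where

$$\rho_\lambda^\alpha = \sum_{r_\lambda, q_\lambda, q'_\lambda} \alpha_\lambda(q_\lambda, r_\lambda|\alpha)\, \alpha_\lambda^*(q'_\lambda, r_\lambda|\alpha)\, |q_\lambda\rangle\langle q'_\lambda|$$

for all $\alpha$ and for $\lambda = 1, 2$.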





Figure 5: Net for 2 single-node listeners and a mixed state. 



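Next consider the QB net of Fig.5, where

nodes | states | amplitudes | comments
----- | ------ | ---------- | --------
$\underline{f}$ | $f$ | $\sqrt{w_f}$ | $\sum_f w_f = 1$
$\underline{e}$ | $e = (e_1, e_2)$ | $\langle e|\psi_f\rangle = \psi_f(e)$ | $\sum_e |\psi_f(e)|^2 = 1$
$\underline{x}$ | $x$ | $\delta(x, e_1)$ |
$\underline{y}$ | $y$ | $\delta(y, e_2)$ |

The meta density matrix $\mu$ of this net is

$$\mu = |\psi_{meta}\rangle\langle\psi_{meta}| ,$$

where

$$|\psi_{meta}\rangle = \sum_{f,x,y} \sqrt{w_f}\, \psi_f(x,y)\, |f, e = (x,y), x, y\rangle .$$

Define $\sigma$ by

$$\sigma = {\rm tr}_f(\mu) = \sum_{f,x,y,x',y'} w_f\, \psi_f(x,y)\psi_f^*(x',y')\, |e = (x,y), x, y\rangle\langle e = (x',y'), x', y'| . \tag{5.15}$$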

We wish to calculate $S_\sigma(x : y|e)$. Let

$$P(x,y) = \sum_f w_f |\psi_f(x,y)|^2 . \tag{5.16}$$

We can define a density matrix $\rho(y)$ for each $y \in S_y$ by

$$\rho(y) = \frac{\sum_{f,x,x'} w_f\, \psi_f(x,y)\psi_f^*(x',y)\, |x\rangle\langle x'|}{P(y)} . \tag{5.17}$$

In an analogous manner, we can define a density matrix $\rho(x)$ for each $x \in S_x$. It is
also convenient to define $\rho$ by

$$\rho = ES_e\, {\rm tr}_f(\mu) = \sum_f w_f |\psi_f\rangle\langle\psi_f| . \tag{5.18}$$

One has that

$$S_\sigma(x : y|e) = S_\sigma(x,e) + S_\sigma(y,e) - S_\sigma(x,y,e) - S_\sigma(e) . \tag{5.19}$$

Using the observations of Appendix C, one gets

$$S_\sigma(x,e) = S\left[\sum_y P(y)\, |y\rangle\langle y|\, \rho(y)\right] = H(y) + \sum_y P(y) S[\rho(y)] . \tag{5.20}$$

Likewise,

$$S_\sigma(y,e) = H(x) + \sum_x P(x) S[\rho(x)] . \tag{5.21}$$

Furthermore,

$$S_\sigma(x,y,e) = S(\rho) , \tag{5.22}$$

and

$$S_\sigma(e) = H(x,y) . \tag{5.23}$$

Therefore,

$$S_\sigma(x : y|e) = H(x : y) + \sum_x P(x) S[\rho(x)] + \sum_y P(y) S[\rho(y)] - S(\rho) . \tag{5.24}$$

Note that if $w_f = \delta(f,0)$, then $\rho(x)$, $\rho(y)$ and $\rho$ are all pure states so the
right-hand side of the last equation reduces to $H(x : y)$. This is what the previous
section on pure states would lead us to expect.

Now consider the case that $S_x = S_y = Bool$. Let $w_{x\cdot} = \sum_{y=0}^{1} w_{xy}$, and
$w_{\cdot y} = \sum_{x=0}^{1} w_{xy}$. If we specialize Eq.(5.24) by assuming that the states $|\psi_f\rangle$ are the
Bell Basis states (defined by Eq.(2.13)), then we obtain

$$S_\sigma(x : y|e) = h(w_{0\cdot}) + 1 - H(\vec{w}) . \tag{5.25}$$

The last equation gives $ST$ for a Bell diagonal mixture. $E_F(\rho)$ for this same state was
given in Eq.(2.22). It is not yet clear how these two results are connected. Also, note
that Eq.(5.25) is not yet maximized over all a priori local unitary transformations,
and one should perform this maximization before comparing it with $E_F(\rho)$, if one is
to follow the same rules that were used in the pure state case.
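To see the two Bell-diagonal formulas side by side, the following Python sketch evaluates Eq.(5.25) and Eq.(2.22) for a few weight assignments. The indexing convention `w[f0][f1]` for the weight of $|B_{f_0 f_1}\rangle$, the function names and the example weights are assumptions made for illustration. As the text notes, Eq.(5.25) is not maximized over a priori local unitary transformations, so the numbers are not a like-for-like comparison.

```python
from math import log2, sqrt

def h(p):
    """Binary entropy function, in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def shannon(ps):
    """Shannon entropy (bits) of a sequence of probabilities."""
    return -sum(p * log2(p) for p in ps if p > 0)

def st_bell_diagonal(w):
    """Unmaximized ST of Eq.(5.25); w[f0][f1] is the weight of the Bell state B_{f0 f1}."""
    w0 = w[0][0] + w[0][1]                       # w_{0.}
    flat = [w[0][0], w[0][1], w[1][0], w[1][1]]
    return h(w0) + 1 - shannon(flat)

def ef_bell_diagonal(w):
    """E_F of Eq.(2.22) for the same weights."""
    W = max(w[0][0], w[0][1], w[1][0], w[1][1])
    if W <= 0.5:
        return 0.0
    return h(0.5 + sqrt(W * (1 - W)))

examples = {
    "single Bell state": [[1.0, 0.0], [0.0, 0.0]],
    "uniform mixture": [[0.25, 0.25], [0.25, 0.25]],
    "B00 and B01, equal": [[0.5, 0.5], [0.0, 0.0]],
    "B00 and B10, equal": [[0.5, 0.0], [0.5, 0.0]],
}
table = {name: (st_bell_diagonal(w), ef_bell_diagonal(w)) for name, w in examples.items()}
for name, (st, ef) in table.items():
    print(name, st, ef)
```

For a single Bell state both formulas give 1, and for the uniform mixture both give 0. For an equal mixture of $|B_{00}\rangle$ and $|B_{10}\rangle$, Eq.(5.25) evaluates to 1 while Eq.(2.22) gives 0, which illustrates that the two quantities can differ for mixed states.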

6 Properties of Tanglement and C.M.I. 

In this section we will discuss various properties satisfied by tanglements and c.m.i.'s. 
The following notation will be used henceforth. 

Often, after stating something about the classical entropy $H$ or the classical
tanglement $HT$, we will append to the end of the statement the symbol $[H \rightarrow S]$ to
indicate that the statement is also valid if one replaces $H$ by $S$ everywhere. Likewise,
the symbol $[S \rightarrow H]$ will indicate that the previous statement is also valid if we replace
$S$ by $H$ everywhere.

For any set $S$, its power set $Pow(S)$ is the set of all subsets of $S$, including
the null set. For example, $Pow(\{1,2\}) = \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$. If $S$ has $|S|$ elements,
then $Pow(S)$ has $2^{|S|}$ elements. For this reason $Pow(S)$ is often denoted by $2^S$.
We will also use $Pow(S)_j$ for any $j \in Z_{0,|S|}$ to denote the set of all subsets of $S$
which contain $j$ elements. For example, $Pow(Z_{1,3})_2 = \{\{1,2\}, \{1,3\}, \{2,3\}\}$. Clearly,
$Pow(S) = \cup_{j=0}^{|S|} Pow(S)_j$.

For any set $S = \{a_1, a_2, \ldots, a_n\}$, let $(:_{a \in S}\, a) = (:_{j=1}^{n} a_j) = a_1 : a_2 : \ldots : a_n$.

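Suppose $\underline{E}, \underline{X}_1, \underline{X}_2, \ldots, \underline{X}_n$ with $n \geq 2$ are non-empty disjoint node collections
of a Bayesian net, and $\Gamma_\alpha$ for $\alpha \in Z_{1,m}$ are non-empty disjoint subsets of $Z_{1,n}$. We
will sometimes use the following $\tau$, $\mu$ shorthand for tanglement and c.m.i.:

$$\tau(\Gamma_1 : \Gamma_2 : \ldots : \Gamma_m) = HT[(\underline{X}.)_{\Gamma_1} : (\underline{X}.)_{\Gamma_2} : \ldots : (\underline{X}.)_{\Gamma_m}|\underline{E}] , \quad [H \rightarrow S] \tag{6.1}$$

$$\mu(\Gamma_1 : \Gamma_2 : \ldots : \Gamma_m) = H[(\underline{X}.)_{\Gamma_1} : (\underline{X}.)_{\Gamma_2} : \ldots : (\underline{X}.)_{\Gamma_m}|\underline{E}] . \quad [H \rightarrow S] \tag{6.2}$$

For example,

$$\tau(1 : 2 : (3,4)) = HT(\underline{X}_1 : \underline{X}_2 : (\underline{X}_3, \underline{X}_4)|\underline{E}) , \quad [H \rightarrow S] \tag{6.3}$$

$$\mu(1 : 2 : (3,4)) = H(\underline{X}_1 : \underline{X}_2 : (\underline{X}_3, \underline{X}_4)|\underline{E}) . \quad [H \rightarrow S] \tag{6.4}$$

Sometimes, we will put the argument of $\tau$ or $\mu$ in a subscript (e.g., $\tau_{1:2}$), while other
times we will put it in parentheses (e.g., $\tau(1 : 2)$).

In discussing the following properties, we will use $\underline{E}, \underline{X}_1, \underline{X}_2, \ldots, \underline{X}_n$ with $n \geq 2$
to denote non-empty disjoint node collections of a Bayesian net.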

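(1) Symmetry

H tanglement and H c.m.i. are symmetric under permutations of their
listeners. $[H \rightarrow S]$

(2) Sign of tanglement

One has that

$$H(\underline{X}_1 : \underline{X}_2|\underline{E}) = H(\underline{X}_1|\underline{E}) + H(\underline{X}_2|\underline{E}) - H[(\underline{X}_1, \underline{X}_2)|\underline{E}] = H(\underline{X}_1|\underline{E}) - H(\underline{X}_1|\underline{E}, \underline{X}_2) \geq 0 , \quad [H \rightarrow S] \tag{6.5}$$

where the inequality follows by strong subadditivity.

Tanglement is non-negative for any number of listeners, not just two. Indeed,
an n-listener tanglement can always be expressed as a sum of 2-listener tanglements.
For example, for 4 listeners, one has

$$\tau(1 : 2 : 3 : 4) = \tau(1 : 2) + \tau((1,2) : 3) + \tau((1,2,3) : 4) \geq 0 . \quad [H \rightarrow S] \tag{6.6}$$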



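(3) Decomposition of c.m.i.

In discussing tanglements, c.m.i.'s often arise. Next we will show how to
express a c.m.i. as a sum of $\pm$ non-mutual informations.
For 2 listeners,

$$H(\underline{X}_1 : \underline{X}_2|\underline{E}) = H(\underline{X}_1|\underline{E}) + H(\underline{X}_2|\underline{E}) - H(\underline{X}_1, \underline{X}_2|\underline{E}) , \tag{6.7}$$

$$H(\underline{X}_1 : \underline{X}_2|\underline{E}) = H(\underline{X}_1, \underline{E}) + H(\underline{X}_2, \underline{E}) - H(\underline{X}_1, \underline{X}_2, \underline{E}) - H(\underline{E}) . \quad [H \rightarrow S] \tag{6.8}$$

For 3 listeners,

$$H(\underline{X}_1 : \underline{X}_2 : \underline{X}_3|\underline{E}) = \left\{ \begin{array}{l} H(\underline{X}_1|\underline{E}) + H(\underline{X}_2|\underline{E}) + H(\underline{X}_3|\underline{E}) \\ -H(\underline{X}_1, \underline{X}_2|\underline{E}) - H(\underline{X}_1, \underline{X}_3|\underline{E}) - H(\underline{X}_2, \underline{X}_3|\underline{E}) \\ +H(\underline{X}_1, \underline{X}_2, \underline{X}_3|\underline{E}) \end{array} \right. , \tag{6.9}$$

$$H(\underline{X}_1 : \underline{X}_2 : \underline{X}_3|\underline{E}) = \left\{ \begin{array}{l} H(\underline{X}_1, \underline{E}) + H(\underline{X}_2, \underline{E}) + H(\underline{X}_3, \underline{E}) \\ -H(\underline{X}_1, \underline{X}_2, \underline{E}) - H(\underline{X}_1, \underline{X}_3, \underline{E}) - H(\underline{X}_2, \underline{X}_3, \underline{E}) \\ +H(\underline{X}_1, \underline{X}_2, \underline{X}_3, \underline{E}) \\ -H(\underline{E}) \end{array} \right. . \quad [H \rightarrow S] \tag{6.10}$$

For 4 listeners,

$$H(\underline{X}_1 : \underline{X}_2 : \underline{X}_3 : \underline{X}_4|\underline{E}) = \left\{ \begin{array}{l} \sum_{\alpha=1}^{4} H(\underline{X}_\alpha|\underline{E}) \\ -\sum_{1 \leq \alpha < \beta \leq 4} H(\underline{X}_\alpha, \underline{X}_\beta|\underline{E}) \\ +\sum_{1 \leq \alpha < \beta < \gamma \leq 4} H(\underline{X}_\alpha, \underline{X}_\beta, \underline{X}_\gamma|\underline{E}) \\ -H(\underline{X}_1, \underline{X}_2, \underline{X}_3, \underline{X}_4|\underline{E}) \end{array} \right. , \tag{6.11}$$

$$H(\underline{X}_1 : \underline{X}_2 : \underline{X}_3 : \underline{X}_4|\underline{E}) = \left\{ \begin{array}{l} \sum_{\alpha=1}^{4} H(\underline{X}_\alpha, \underline{E}) \\ -\sum_{1 \leq \alpha < \beta \leq 4} H(\underline{X}_\alpha, \underline{X}_\beta, \underline{E}) \\ +\sum_{1 \leq \alpha < \beta < \gamma \leq 4} H(\underline{X}_\alpha, \underline{X}_\beta, \underline{X}_\gamma, \underline{E}) \\ -H(\underline{X}_1, \underline{X}_2, \underline{X}_3, \underline{X}_4, \underline{E}) \\ -H(\underline{E}) \end{array} \right. . \quad [H \rightarrow S] \tag{6.12}$$

One can show by induction that for $n \geq 2$ listeners,

$$H(:_{\lambda=1}^{n} \underline{X}_\lambda|\underline{E}) = \sum_{\lambda=1}^{n} (-1)^{\lambda+1} \sum_{\Gamma \in Pow(Z_{1,n})_\lambda} H[(\underline{X}.)_\Gamma|\underline{E}] , \tag{6.13}$$

$$H(:_{\lambda=1}^{n} \underline{X}_\lambda|\underline{E}) = \sum_{\lambda=1}^{n} (-1)^{\lambda+1} \sum_{\Gamma \in Pow(Z_{1,n})_\lambda} H[(\underline{X}.)_\Gamma, \underline{E}] - H(\underline{E}) . \quad [H \rightarrow S] \tag{6.14}$$

For the quantum case, a simple consequence of the above decomposition of
c.m.i. is as follows. For 2 listeners,

$$S_\rho(\underline{X}_1, \underline{X}_2, \underline{E}) = 0 \ \text{ implies } \ S_\rho(\underline{X}_1 : \underline{X}_2|\underline{E}) = S_\rho(\underline{X}_1 : \underline{X}_2) . \tag{6.15}$$

For 3 listeners,

$$S_\rho(\underline{X}_1, \underline{X}_2, \underline{X}_3, \underline{E}) = 0 \ \text{ implies } \ S_\rho(\underline{X}_1 : \underline{X}_2 : \underline{X}_3|\underline{E}) = -S_\rho(\underline{X}_1 : \underline{X}_2 : \underline{X}_3) . \tag{6.16}$$

One can show that for $n \geq 2$ listeners,

$$S_\rho(\underline{X}_1, \underline{X}_2, \ldots, \underline{X}_n, \underline{E}) = 0 \ \text{ implies } \ S_\rho(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E}) = (-1)^n S_\rho(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n) . \tag{6.17}$$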

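(4) Sign of c.m.i.

The c.m.i. $H(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E})$ $[H \rightarrow S]$ is non-negative for $n = 2$, because
in that case it equals the tanglement $HT(\underline{X}_1 : \underline{X}_2|\underline{E})$ $[H \rightarrow S]$. However, for more than
2 listeners, the c.m.i. may be positive or negative, as the following example shows. [9]
A 3 listener c.m.i. will be positive if one of the 3 listeners drops out so that there
are effectively 2 listeners. Let us construct an example of a 3 listener c.m.i. that is
negative. Assume the listeners are independent of the speaker $\underline{E}$ so that we can omit
the conditioning on $\underline{E}$. Eq.(6.9) can be rewritten as

$$H(\underline{X}_1 : \underline{X}_2 : \underline{X}_3) = Pos + Neg , \tag{6.18}$$

where

$$Pos = H(\underline{X}_1) - H(\underline{X}_1|\underline{X}_2) = H(\underline{X}_1 : \underline{X}_2) , \tag{6.19}$$

and

$$Neg = -\{H(\underline{X}_1|\underline{X}_3) - H(\underline{X}_1|\underline{X}_2, \underline{X}_3)\} = -H[(\underline{X}_1 : \underline{X}_2)|\underline{X}_3] . \tag{6.20}$$

As their names suggest, $Pos$ and $Neg$ are non-negative and non-positive, respectively. The idea
is to make $\underline{X}_1$ and $\underline{X}_2$ independent so that $Pos$ vanishes. The following probability
distribution fits that bill:

$$P(X_1, X_2, X_3) = \frac{1}{4}\left[\delta_{X_3}^{0}\, \delta_{X_1}^{X_2} + \delta_{X_3}^{1}\, \delta_{X_1}^{\bar{X}_2}\right] , \tag{6.21}$$

where $X_1, X_2, X_3 \in Bool$, $\bar{0} = 1$ and $\bar{1} = 0$. This distribution gives $Pos = 0$ and
$Neg = -1$.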
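The numbers quoted for this example can be reproduced directly. The Python sketch below encodes the distribution of Eq.(6.21), in which $X_1$ and $X_2$ are independent fair bits and $X_3$ is their sum modulo 2, and evaluates the 3-part mutual information with the alternating sum of Eq.(6.13), omitting the conditioning on the speaker as in the text. Function names are illustrative assumptions.

```python
from math import log2
from itertools import combinations, product

def H(joint, keep):
    """Shannon entropy (bits) of the marginal of `joint` on the index positions `keep`."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def mutual_info(joint, nodes):
    """n-part mutual information: sum over non-empty subsets of (-1)^(size+1) H(subset)."""
    total = 0.0
    for size in range(1, len(nodes) + 1):
        sign = (-1) ** (size + 1)
        for subset in combinations(nodes, size):
            total += sign * H(joint, subset)
    return total

# Eq.(6.21): X1 and X2 uniform and independent, X3 = X1 XOR X2.
joint = {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

pos = mutual_info(joint, (0, 1))        # Pos = H(X1 : X2)
three = mutual_info(joint, (0, 1, 2))   # H(X1 : X2 : X3)
neg = three - pos                       # Neg, by Eq.(6.18)
print(pos, neg, three)
```

Each single entropy is 1 bit and each pair entropy, as well as the triple entropy, is 2 bits, so the alternating sum is $3 - 6 + 2 = -1$.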

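(5) Duality between tanglement and c.m.i.

We wish to express tanglements in terms of c.m.i.'s and vice versa. For 2
listeners, one finds

$$\tau_{1:2} = \mu_{1:2} . \quad [H \rightarrow S] \tag{6.22}$$

For 3 listeners, one finds

$$\tau_{1:2:3} = \mu_{1:2} + \mu_{1:3} + \mu_{2:3} - \mu_{1:2:3} , \quad [H \rightarrow S] \tag{6.23}$$

$$\mu_{1:2:3} = \tau_{1:2} + \tau_{1:3} + \tau_{2:3} - \tau_{1:2:3} . \quad [H \rightarrow S] \tag{6.24}$$

For 4 listeners, one finds

$$\tau(1 : 2 : 3 : 4) = \sum_{\Gamma \in Pow(Z_{1,4})_2} \mu(:_{j \in \Gamma}\, j) - \sum_{\Gamma \in Pow(Z_{1,4})_3} \mu(:_{j \in \Gamma}\, j) + \mu(1 : 2 : 3 : 4) , \quad [H \rightarrow S] \tag{6.25}$$

$$\mu(1 : 2 : 3 : 4) = \sum_{\Gamma \in Pow(Z_{1,4})_2} \tau(:_{j \in \Gamma}\, j) - \sum_{\Gamma \in Pow(Z_{1,4})_3} \tau(:_{j \in \Gamma}\, j) + \tau(1 : 2 : 3 : 4) . \quad [H \rightarrow S] \tag{6.26}$$

One can show by induction that for $n \geq 2$ listeners,

$$\tau(1 : 2 : \ldots : n) = \sum_{\lambda=2}^{n} (-1)^\lambda \sum_{\Gamma \in Pow(Z_{1,n})_\lambda} \mu(:_{j \in \Gamma}\, j) , \quad [H \rightarrow S] \tag{6.27}$$

$$\mu(1 : 2 : \ldots : n) = \sum_{\lambda=2}^{n} (-1)^\lambda \sum_{\Gamma \in Pow(Z_{1,n})_\lambda} \tau(:_{j \in \Gamma}\, j) . \quad [H \rightarrow S] \tag{6.28}$$

An interesting aspect of Eqs.(6.27) and (6.28) is that they transform into each
other when one exchanges the symbols $\tau$ and $\mu$. Therefore, we will call such equations
duality equations, and say that they describe a duality between tanglement and c.m.i.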

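(6) Merging two listeners

It is easy to check that for $n \geq 2$,

$$HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n : \underline{X}_{n+1}|\underline{E}) - HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_{n-1} : (\underline{X}_n, \underline{X}_{n+1})|\underline{E}) = HT(\underline{X}_n : \underline{X}_{n+1}|\underline{E}) \geq 0 . \quad [H \rightarrow S] \tag{6.29}$$

In $\tau$ notation,

$$\tau[1 : 2 : \ldots : n-1 : (n, n+1)] \leq \tau[1 : 2 : \ldots : n : n+1] . \quad [H \rightarrow S] \tag{6.30}$$

For example,

$$\tau_{1:2,3} \leq \tau_{1:2:3} . \quad [H \rightarrow S] \tag{6.31}$$

Thus, "merging" two listeners decreases tanglement. Since tanglement is non-negative,
if the right-hand side of this inequality is zero, so is the left-hand side.

(7) Pruning or removing a listener

It is easy to check that for $n \geq 2$,

$$HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_{n-1} : (\underline{X}_n, \underline{X}_{n+1})|\underline{E}) - HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E}) = HT[(\underline{X}_1, \underline{X}_2, \ldots, \underline{X}_{n-1}) : \underline{X}_{n+1}|\underline{X}_n, \underline{E}] \geq 0 . \quad [H \rightarrow S] \tag{6.32}$$

In $\tau$ notation,

$$\tau[1 : 2 : \ldots : n-1 : n] \leq \tau[1 : 2 : \ldots : n-1 : (n, n+1)] . \quad [H \rightarrow S] \tag{6.33}$$

For example,

$$\tau_{1:2} \leq \tau_{1:(2,3)} . \quad [H \rightarrow S] \tag{6.34}$$

Thus, "pruning" a listener (i.e., removing some but not all of its nodes) decreases
tanglement. Since tanglement is non-negative, if the right-hand side of this inequality
is zero, so is the left-hand side.

And what happens if we remove all the nodes of a listener? It is easy to check
that for $n \geq 2$,

$$HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n : \underline{X}_{n+1}|\underline{E}) - HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E}) = HT[(\underline{X}_1, \underline{X}_2, \ldots, \underline{X}_n) : \underline{X}_{n+1}|\underline{E}] \geq 0 . \quad [H \rightarrow S] \tag{6.35}$$

In the $\tau$ notation,

$$\tau(1 : 2 : \ldots : n) \leq \tau(1 : 2 : \ldots : n : n+1) . \quad [H \rightarrow S] \tag{6.36}$$

For example,

$$\tau_{1:2} \leq \tau_{1:2:3} . \quad [H \rightarrow S] \tag{6.37}$$

Thus, completely "removing" a listener also decreases tanglement. Since tanglement
is non-negative, if the right-hand side of this inequality is zero, so is the left-hand
side.

Note that if $\tau_{1:2:\ldots:n} = 0$ for some $n$, then $\mu_{1:2:\ldots:n} = 0$. Indeed, by the duality
equations, $\mu_{1:2:\ldots:n}$ can be expressed as a sum of $\pm\, \tau$'s obtained from $\tau_{1:2:\ldots:n}$ by
removing some of its listeners. But all such $\tau$ must be zero because $\tau_{1:2:\ldots:n} = 0$ and
removing listeners decreases tanglement.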
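The merging, pruning and removing identities can be spot-checked on an arbitrary classical distribution. The Python sketch below does this for $n = 2$, i.e. for three single-node listeners $x_1, x_2, x_3$ and a single-node speaker $e$, all binary. The distribution is generated from a fixed pseudo-random seed purely for illustration, and the helper names are assumptions.

```python
import random
from math import log2
from itertools import product

def H(joint, keep):
    """Shannon entropy (bits) of the marginal of `joint` on the index positions `keep`."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def h_tanglement(joint, listeners, speaker):
    """HT(L_1 : ... : L_n | E); listeners is a list of index tuples, speaker an index tuple."""
    speaker = tuple(speaker)
    h_e = H(joint, speaker)
    cond = lambda nodes: H(joint, tuple(nodes) + speaker) - h_e
    everyone = [i for listener in listeners for i in listener]
    return sum(cond(listener) for listener in listeners) - cond(everyone)

# Arbitrary joint distribution over (x1, x2, x3, e), reproducible via a fixed seed.
rng = random.Random(0)
weights = {o: rng.random() for o in product((0, 1), repeat=4)}
norm = sum(weights.values())
joint = {o: w / norm for o, w in weights.items()}

E = (3,)
t_123 = h_tanglement(joint, [(0,), (1,), (2,)], E)    # tau(1:2:3)
t_1_23 = h_tanglement(joint, [(0,), (1, 2)], E)       # tau(1:(2,3))
t_12 = h_tanglement(joint, [(0,), (1,)], E)           # tau(1:2)

merge_gap = t_123 - t_1_23     # left-hand side of Eq.(6.29)
prune_gap = t_1_23 - t_12      # left-hand side of Eq.(6.32)
remove_gap = t_123 - t_12      # left-hand side of Eq.(6.35)
print(merge_gap, prune_gap, remove_gap)
```

Each gap equals a two-listener tanglement, namely $HT(x_2 : x_3|e)$, $HT(x_1 : x_3|x_2, e)$ and $HT((x_1,x_2) : x_3|e)$ respectively, which is why all three are non-negative.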

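(8) Decomposing compound listeners of tanglement and c.m.i.

One has, for example,

$$\tau_{1:2,3} = \tau_{1:2:3} - \tau_{2:3} , \quad [H \rightarrow S] \tag{6.38}$$

$$\tau_{1,2:3,4} = \tau_{1:2:3:4} - \tau_{1:2} - \tau_{3:4} , \quad [H \rightarrow S] \tag{6.39}$$

$$\tau_{1:2,3:4,5,6} = \tau_{1:2:3:4:5:6} - \tau_{2:3} - \tau_{4:5:6} . \quad [H \rightarrow S] \tag{6.40}$$

Note that compound listeners in the left-hand side are "split" in the right-hand side.
More generally, suppose that $\underline{E}, \underline{X}_1, \underline{X}_2, \ldots, \underline{X}_n$ for some $n \geq 2$ are non-empty disjoint
node collections of a Bayesian net, and $\Gamma_\alpha$ for $\alpha \in Z_{1,m}$ are non-empty disjoint subsets
of $Z_{1,n}$. Then

$$HT[:_{\alpha=1}^{m} (\underline{X}.)_{\Gamma_\alpha}|\underline{E}] = HT[:_{j \in \Gamma_1 \cup \Gamma_2 \ldots \Gamma_m} \underline{X}_j|\underline{E}] - \sum_{\alpha=1}^{m} HT[:_{j \in \Gamma_\alpha} \underline{X}_j|\underline{E}] , \quad [H \rightarrow S] \tag{6.41}$$

where we define $HT[:_{j \in \Gamma_\alpha} \underline{X}_j|\underline{E}] = 0$ if $\Gamma_\alpha$ has only one element. In $\tau$ notation,

$$\tau(\Gamma_1 : \Gamma_2 : \ldots : \Gamma_m) = \tau(:_{j \in \Gamma_1 \cup \Gamma_2 \ldots \Gamma_m}\, j) - \sum_{\alpha=1}^{m} \tau(:_{j \in \Gamma_\alpha}\, j) , \quad [H \rightarrow S] \tag{6.42}$$

where we define $\tau(:_{j \in \Gamma_\alpha}\, j) = 0$ if $\Gamma_\alpha$ has only one element. Thus, any tanglement
which has compound listeners can be expressed as a sum of $\pm$ tanglements whose
listeners are smaller (i.e., have fewer nodes).

Note that given a c.m.i. with compound listeners, one can: (1) use the duality
equations to express the c.m.i. as a sum of $\pm$ tanglements; (2) use the results of this
section to express the tanglements obtained in step 1 as a sum of $\pm$ tanglements
which have smaller listeners; (3) use the duality equations to express the tanglements
obtained in step 2 as a sum of $\pm$ c.m.i.'s. For example,

$$\mu_{1:2,3} = \tau_{1:2,3} = \tau_{1:2:3} - \tau_{2:3} = \mu_{1:2} + \mu_{1:3} - \mu_{1:2:3} . \quad [H \rightarrow S] \tag{6.43}$$

Thus, any c.m.i. which has compound listeners can be expressed as a sum of $\pm$ c.m.i.'s
whose listeners are smaller.

Another way of decomposing the compound listeners of a c.m.i. is by using
the following "chain rule":

$$H[\underline{X}_1 : (\underline{X}_2, \underline{X}_3, \ldots, \underline{X}_n)|\underline{E}] = \sum_{\lambda=2}^{n} H[\underline{X}_1 : \underline{X}_\lambda|(\underline{X}_{\lambda+1}, \ldots, \underline{X}_n, \underline{E})] . \quad [H \rightarrow S] \tag{6.44}$$

For example,

$$H[\underline{X}_1 : (\underline{X}_2, \underline{X}_3, \underline{X}_4)|\underline{E}] = H[\underline{X}_1 : \underline{X}_2|\underline{X}_3, \underline{X}_4, \underline{E}] + H[\underline{X}_1 : \underline{X}_3|\underline{X}_4, \underline{E}] + H[\underline{X}_1 : \underline{X}_4|\underline{E}] . \quad [H \rightarrow S] \tag{6.45}$$

This rule is also valid for more than 2 listeners. For example, it can be used to
decompose the listeners of $\mu((1,2) : (3,4) : (5,6,7))$.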

(9) Conditionally separable states 

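Suppose

$$P(X_1, X_2, \ldots, X_n, E) = P(X_1|E)P(X_2|E) \ldots P(X_n|E)P(E) \tag{6.46}$$

for all values of $X_1, X_2, \ldots, X_n, E$. Then $HT(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E}) = 0$. If the
speaker $\underline{E}$ is a single node $\underline{e}$, and for each $\lambda$, the listener $\underline{X}_\lambda$ is a single node $\underline{x}_\lambda$, then
Eq.(6.46) is satisfied by the CB net in Fig.6.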




Figure 6: Net with one speaker and n listener nodes. 

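So far we've only considered the classical case. The analogous result in the
quantum case is as follows. Suppose that $\rho$ is defined by

$$\rho = \sum_{E} w_E\, |E\rangle\langle E|\, \rho_1^E \rho_2^E \ldots \rho_n^E , \tag{6.47}$$

where the $w_E$'s are non-negative numbers that add up to one, where $\{|E\rangle \,|\, \forall E\}$ is an
orthonormal basis for $\mathcal{H}_E$, and where for all $\lambda \in Z_{1,n}$ and for all $E$, $\rho_\lambda^E$ acts on $\mathcal{H}_{X_\lambda}$.
The Hilbert spaces $\mathcal{H}_{X_\lambda}$ for all $\lambda$ and $\mathcal{H}_E$ are different spaces. Then
$ST_\rho(\underline{X}_1 : \underline{X}_2 : \ldots : \underline{X}_n|\underline{E}) = 0$. If the speaker $\underline{E}$ is a single node $\underline{\alpha}$, and for each $\lambda$, the listener $\underline{X}_\lambda$
is a single node $\underline{x}_\lambda$, then the $\rho$ of Eq.(6.47) can be implemented by a QB net with a
graph like the one in Fig. 4, but such that $\underline{\alpha}$ has $n$ branches instead of just 2.

We showed previously that $\tau_{1:2:\ldots:n} = 0$ implies $\mu_{1:2:\ldots:n} = 0$. The converse
statement is not true (for $n$ larger than 2). Next we will give an example of a
situation in which the c.m.i. is always zero but the tanglement may be non-zero.

Suppose $n > 2$ and $\Gamma_1, \Gamma_2$ are non-empty disjoint sets such that $\Gamma_1 \cup \Gamma_2 = Z_{1,n}$.
In the classical case, assume

$$P(X_1, X_2, \ldots, X_n, E) = P[(X.)_{\Gamma_1}|E]\, P[(X.)_{\Gamma_2}|E]\, P(E) \tag{6.48}$$

for all values of $X_1, X_2, \ldots, X_n, E$. In the quantum case, assume

$$\rho = \sum_{E} w_E\, |E\rangle\langle E|\, \rho_{\Gamma_1}^E \rho_{\Gamma_2}^E , \tag{6.49}$$

where the $w_E$'s are non-negative numbers that add up to one, and where for $\lambda \in Z_{1,2}$
and for all $E$, $\rho_{\Gamma_\lambda}^E$ acts on $\mathcal{H}_{(X.)_{\Gamma_\lambda}}$. Then $\mu_{1:2:\ldots:n} = 0$. We won't give a completely
general proof of this theorem. We will only prove it for $n = 4$.
One of the duality equations is:

$$\mu_{1:2:3:4} = A - B + C , \quad [H \rightarrow S] \tag{6.50}$$

where

$$A = \tau_{1:2} + \tau_{1:3} + \tau_{1:4} + \tau_{2:3} + \tau_{2:4} + \tau_{3:4} , \quad [H \rightarrow S] \tag{6.51}$$

$$B = \tau_{1:2:3} + \tau_{1:2:4} + \tau_{1:3:4} + \tau_{2:3:4} , \quad [H \rightarrow S] \tag{6.52}$$

$$C = \tau_{1:2:3:4} . \quad [H \rightarrow S] \tag{6.53}$$

First suppose that $\Gamma_1 = \{1,2\}$ and $\Gamma_2 = \{3,4\}$. Then $\tau(\Gamma_1 : \Gamma_2) = 0$. If $\Gamma'_1$ (ditto, $\Gamma'_2$)
is a non-empty subset of $\Gamma_1$ (ditto, $\Gamma_2$), then, because pruning listeners decreases
tanglement, $\tau(\Gamma'_1 : \Gamma'_2) = 0$. Using Eq.(6.42) to decompose the compound listeners of
$\tau(\Gamma'_1 : \Gamma'_2)$, one gets

$$\tau(:_{j \in \Gamma'_1 \cup \Gamma'_2}\, j) = \tau(:_{j \in \Gamma'_1}\, j) + \tau(:_{j \in \Gamma'_2}\, j) . \quad [H \rightarrow S] \tag{6.54}$$

Using Eq.(6.54), one gets

$$A = \tau_{1:2} + \tau_{3:4} , \quad [H \rightarrow S] \tag{6.55}$$

$$B = 2(\tau_{1:2} + \tau_{3:4}) , \quad [H \rightarrow S] \tag{6.56}$$

$$C = \tau_{1:2:3:4} , \quad [H \rightarrow S] \tag{6.57}$$

so

$$\mu_{1:2:3:4} = -\tau_{1:2} - \tau_{3:4} + \tau_{1:2:3:4} = 0 . \quad [H \rightarrow S] \tag{6.58}$$

Next suppose that $\Gamma_1 = \{1\}$ and $\Gamma_2 = \{2,3,4\}$. Using Eq.(6.54), one gets

$$A = \tau_{2:3} + \tau_{2:4} + \tau_{3:4} , \quad [H \rightarrow S] \tag{6.59}$$

$$B = \tau_{2:3} + \tau_{2:4} + \tau_{3:4} + \tau_{2:3:4} , \quad [H \rightarrow S] \tag{6.60}$$

$$C = \tau_{2:3:4} , \quad [H \rightarrow S] \tag{6.61}$$

so

$$\mu_{1:2:3:4} = 0 . \quad [H \rightarrow S] \tag{6.62}$$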

(10) A posteriori local unitary transformations 

In Section 4, we distinguished between a priori and a posteriori local unitary 
transformations, and we maximized ST over all a priori transformations. Next we 
will show that ST is in fact invariant under a posteriori local unitary transformation. 
For definiteness, we will calculate ST for a pure state and 2 single-node listeners, 
but analogous conclusions hold for a mixed state and n > 2 either single-node or 
compound listeners. 




Figure 7: Net with one speaker node and 2 branches, each branch with 2 nodes. 
Consider the QB net of Fig. 7, where 



nodes 


states 


amplitudes 


comments 


e 


e = (61,62) 


He) 


Ee 1^^(6)1^ = 1 


X 


X 


6{x,ei) 




y 


y 


5(?/,62) 




a 


a 




Ea U*^Uax' = ^x' 


b 


b 


Uhy 





Let $\mathcal{N}$ be the QB net which contains all the nodes shown in Fig. 7. Let $\mathcal{N}_0$ be the
sub-net which contains only nodes e, x and y.
The meta density matrix $\mu_0$ of $\mathcal{N}_0$ is

$\mu_0 = |\psi^0_{meta}\rangle\langle\psi^0_{meta}|$ , (6.63)

where

$|\psi^0_{meta}\rangle = \sum_{x,y} \psi(x,y)\, |e = (x,y), x, y\rangle$ . (6.64)

The meta density matrix $\mu$ of $\mathcal{N}$ is

$\mu = |\psi_{meta}\rangle\langle\psi_{meta}|$ , (6.65)

where

$|\psi_{meta}\rangle = \sum_{x,y,a,b} U_{ax} V_{by}\, \psi(x,y)\, |e = (x,y), x, y, a, b\rangle$ . (6.66)

This last equation can be rewritten as

$|\psi_{meta}\rangle = \sum_{x,y} \psi(x,y)\, |e = (x,y), x, y\rangle |\phi_a(x)\rangle |\phi_b(y)\rangle$ , (6.67)

where

$|\phi_a(x)\rangle = \sum_a U_{ax} |a\rangle$ , $|\phi_b(y)\rangle = \sum_b V_{by} |b\rangle$ . (6.68)

The $|\phi_a(x)\rangle$'s (ditto, $|\phi_b(y)\rangle$'s) are an orthonormal basis in $\mathcal{H}_a$ (ditto, $\mathcal{H}_b$) labelled
by the indices x (ditto, y).
Define $\rho$ by

$\rho = e\sum_{x,y}(\mu) = \sum_{x,y,x',y'} \psi(x,y)\psi^*(x',y')\, |e = (x,y), \phi_a(x), \phi_b(y)\rangle\langle e = (x',y'), \phi_a(x'), \phi_b(y')|$ . (6.69)

The only difference between $\rho$ and $\mu_0$ is that the $\phi_a(x)$ and $\phi_b(y)$ indices in $\rho$ are
replaced by x and y in $\mu_0$. Thus,

$S_\rho(a : b|e) = S_{\mu_0}(x : y|e)$ . (6.70)

In other words, ST for net $\mathcal{N}$, density matrix $\rho$ and listeners a and b equals ST
for sub-net $\mathcal{N}_0$, density matrix $\mu_0$ and listeners x and y. Note that in the definition
Eq.(6.69) of $\rho$, we e-summed $\mu$ over x and y. Consider a density matrix $\sigma$ defined by
trace-ing instead of e-summing over x, y:

$\sigma = tr_{x,y}(\mu) = \sum_{x,y} \psi(x,y)\psi^*(x,y)\, |e = (x,y), \phi_a(x), \phi_b(y)\rangle\langle e = (x,y), \phi_a(x), \phi_b(y)|$ . (6.71)

It is easy to show that

$S_\sigma(a : b|e) = 0$ . (6.72)

Thus, e-summing over x and y (which corresponds to not measuring those nodes) 
gives the same ST as if the local transformations at nodes a, b had not occurred. On 
the other hand, trace-ing over x and y (which corresponds to measuring those nodes 
in a particular way) gives zero ST, just as in the classical case. 
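
The classical statement at the end of the last paragraph can be checked numerically. The sketch below assumes a classical Bayesian net with the topology of Fig. 7 (e → x → a and e → y → b), in which x and y copy the two components of e and the nodes a and b apply random transition matrices. The helper names and the random inputs are illustrative assumptions, not part of the paper.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (zero entries are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def cond_mutual_info(p_uve):
    """H(u : v | e) = H(u,e) + H(v,e) - H(u,v,e) - H(e) for an array indexed [u, v, e]."""
    h_ue = entropy(p_uve.sum(axis=1))
    h_ve = entropy(p_uve.sum(axis=0))
    h_e = entropy(p_uve.sum(axis=(0, 1)))
    return h_ue + h_ve - entropy(p_uve) - h_e

def random_stochastic(rng, n_out, n_in):
    """Random column-stochastic matrix M[out, in] = P(out | in)."""
    m = rng.random((n_out, n_in))
    return m / m.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
n = 3

# Speaker node e = (e1, e2) with an arbitrary joint distribution.
p_e12 = rng.random((n, n))
p_e12 /= p_e12.sum()

# Classical transition matrices for the outer nodes a and b (hypothetical inputs).
p_a_given_x = random_stochastic(rng, n, n)
p_b_given_y = random_stochastic(rng, n, n)

# x = e1 and y = e2 (delta functions), so P(a, b, e) = P(e) P(a | e1) P(b | e2).
p_abe = np.einsum('ax,by,xy->abxy', p_a_given_x, p_b_given_y, p_e12).reshape(n, n, n * n)
eye = np.eye(n)
p_xye = np.einsum('xi,yj,ij->xyij', eye, eye, p_e12).reshape(n, n, n * n)

h_ab = cond_mutual_info(p_abe)
h_xy = cond_mutual_info(p_xye)
print(h_ab, h_xy)
```

Both quantities vanish up to rounding, because in a classical net of this shape the two branches are independent once e is given.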

(11) Conditional Data Processing Inequalities 

An introduction to Data Processing (DP) Inequalities for CB and QB nets may 
be found in Ref. [5]. Here, we will prove a new version of these inequalities which we 
call Conditional DP Inequalities. The Conditional DP Inequalities are conditioned on 
a speaker. Thus, they are closely linked to the phenomenon of tanglement. Consider 
the net of Fig. 7. What we will show is that 

$H(a : b|e) \leq H(x : y|e)$ . $\boxed{H \rightarrow S}$ (6.73)

In the quantum case, we've shown in the previous section entitled "A posteriori 
local unitary transformations" that if nodes a and b correspond to unitary transfor- 
mations and nodes x and y to delta functions, then equality is attained in inequality 
Eq.(6.73). No such assumptions about the nature of the transition matrices of the 
nodes will be made in this section. Our assumptions are only that the QB net has a 
particular topology, that of Fig. 7. 

Clearly, the Conditional DP Inequalities of this section can be greatly general- 
ized in the same way that Ref. [10] generalizes DP Inequalities from a simple Markov 
chain to arbitrary CB or QB nets. In this section, we will discuss only the simplest 
case of the Conditional DP Inequalities. More general cases will be discussed in a 
future paper dedicated exclusively to this subject. 

Eq.(6.73) has a simple interpretation, as all DP inequalities do. It says that
the conditional information transmission between x and y is at least as large as that between
a and b because the first pair of nodes is "closer". Alternatively, one can say that the
probabilistic dependency of x on y is at least as strong as that of a on b because the
first pair of nodes is "closer".

First note that the graph of Fig. 7 satisfies 

$H(a|e,y,b) = H(a|e,y)$ . $\boxed{H \rightarrow S}$ (6.74)

In the classical case, this follows because P{a\e,y,b) = P{a\e,y). By virtue of 
Eq.(6.74) and strong subadditivity, 

$H(a|e,y) = H(a|e,y,b) \leq H(a|e,b)$ . $\boxed{H \rightarrow S}$ (6.75)

Subtracting H{a\e) from each term of the last equation and multiplying the resulting 
equation by $-1$ gives

$H(a : y|e) \geq H(a : b|e)$ . $\boxed{H \rightarrow S}$ (6.76)

Now note that the graph of Fig. 7 satisfies 

$H(y|e,x,a) = H(y|e,x)$ . $\boxed{H \rightarrow S}$ (6.77)

In the classical case, this follows because P{y\e,x,a) = P{y\e,x). By virtue of 
Eq.(6.77) and strong subadditivity, 

$H(y|e,x) = H(y|e,x,a) \leq H(y|e,a)$ . $\boxed{H \rightarrow S}$ (6.78)

Subtracting H{y\e) from each term of the last equation and multiplying the resulting 
equation by $-1$ gives

$H(y : x|e) \geq H(y : a|e)$ . $\boxed{H \rightarrow S}$ (6.79)

Combining Eqs.(6.76) and (6.79) gives 

$H(a : b|e) \leq H(a : y|e) \leq H(x : y|e)$ . $\boxed{H \rightarrow S}$ (6.80)

QED. 




Figure 8: Net with one speaker node and n branches, each branch with 2 nodes. 

Eq.(6.73) can be easily generalized to n > 2 listeners. Consider the graph of 
Fig. 8. Next we will show that for this graph, 

$HT(a_1 : a_2 : \ldots : a_n|e) \leq HT(x_1 : x_2 : \ldots : x_n|e)$ . $\boxed{H \rightarrow S}$ (6.81)

The proof is by induction on $n \geq 2$. Eq.(6.81) has been proven for n = 2. If it is
true for all $n \in Z_{2,n_0}$, then it must be true for $n = n_0 + 1$. Here is why. By virtue of
the induction hypothesis, the following two inequalities must be true:

$HT[(a_1, a_2, \ldots, a_{n_0}) : a_{n_0+1}|e] \leq HT[(x_1, x_2, \ldots, x_{n_0}) : x_{n_0+1}|e]$ , $\boxed{H \rightarrow S}$ (6.82)

$HT[a_1 : a_2 : \ldots : a_{n_0}|e] \leq HT[x_1 : x_2 : \ldots : x_{n_0}|e]$ . $\boxed{H \rightarrow S}$ (6.83)

The sum of the left-hand sides (ditto, right-hand sides) of these two inequalities equals
$HT(a_1 : a_2 : \ldots : a_{n_0+1}|e)$ (ditto, $HT(x_1 : x_2 : \ldots : x_{n_0+1}|e)$) $\boxed{H \rightarrow S}$ . QED

A Proof that $E_F(|\psi_{xy}|) \leq E_F(\psi_{xy})$

We will first prove this inequality for the case that $S_x = S_y = Bool$. Define the
function $p_0(t)$ for $t \in [0, 1]$ by



Po(*) = . (A.l) 

From Eqs.(2.9) and (2.12),

$E_F(\psi_{xy}) = h(p_0(t))$ , (A.2)

where

$t = 4|\psi_{00}\psi_{11} - \psi_{01}\psi_{10}|^2$ . (A.3)

Let

$t' = 4(|\psi_{00}\psi_{11}| - |\psi_{01}\psi_{10}|)^2$ . (A.4)

Note that

$E_F(|\psi_{xy}|) = h(p_0(t'))$ . (A.5)

By the triangle inequality,

$t' \leq t$ . (A.6)
From Fig. 2, $h(p_0(t))$ is a monotonically increasing function of t. Thus

$E_F(|\psi_{xy}|) = h(p_0(t')) \leq h(p_0(t)) = E_F(\psi_{xy})$ . (A.7)
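
Eq.(A.7) can be spot-checked numerically for two Boolean listeners. The sketch below does not use $p_0(t)$. Instead it assumes, as in Eqs.(A.8) and (A.9), that the entanglement of formation of a pure state equals the entropy of $\psi\psi^\dagger$. The function names and the random sampling are illustrative choices, not the paper's.

```python
import numpy as np

def entanglement_entropy(psi):
    """Entropy in bits of rho = psi psi^dagger for a normalized amplitude matrix psi[x, y]."""
    rho = psi @ psi.conj().T
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-15]          # drop numerically zero eigenvalues
    return float(-(lam * np.log2(lam)).sum())

def random_pure_state(rng, nx=2, ny=2):
    """Random complex amplitudes psi[x, y], normalized so that sum |psi|^2 = 1."""
    psi = rng.normal(size=(nx, ny)) + 1j * rng.normal(size=(nx, ny))
    return psi / np.linalg.norm(psi)

rng = np.random.default_rng(1)
gaps = []
for _ in range(1000):
    psi = random_pure_state(rng)
    # Stripping the phases keeps the state normalized; compare the two entropies.
    gaps.append(entanglement_entropy(psi) - entanglement_entropy(np.abs(psi)))

min_gap = min(gaps)
print(min_gap)
```

A non-negative `min_gap` (up to rounding) is what Eq.(A.7) predicts for the 2 x 2 case; the sketch says nothing about larger $N_x$, $N_y$.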
Now consider the case of arbitrary $N_x$, $N_y$ such that $N_x \leq N_y$. Recall

$E_F(\psi_{xy}) = S(\rho)$ , (A.8)

where

$\rho = \psi\psi^\dagger$ . (A.9)
For all x, y, define $\theta_{xy}$ to be the phase of $\psi_{xy}$. Then

$\psi_{xy} = |\psi_{xy}| e^{i\theta_{xy}}$ . (A.10)



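Suppose we vary the angles $\theta_{xy}$. Then

$\delta S(\rho) = -\delta\, tr\left[\frac{\rho \ln \rho}{\ln 2}\right] = -tr\left[\frac{\delta\rho}{\ln 2}(\ln \rho + 1)\right]$ , (A.11)

where

$\delta\rho_{xx'} = \sum_y i(\delta\theta_{xy} - \delta\theta_{x'y})\psi_{xy}\psi^*_{x'y}$ . (A.12)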



When $\theta_{xy} = 0$ for all x and y, $\delta\rho_{xx'}$ is antisymmetric and $\rho_{xx'}$ is symmetric under the
exchange of x and x'. If A and S are, respectively, an antisymmetric and a symmetric
$N \times N$ matrix, then $tr(A) = tr(AS) = 0$. Thus, $tr(\delta\rho) = tr(\delta\rho \ln \rho) = 0$. Thus,
$\delta S(\rho) = 0$ when $\theta_{xy} = 0$ for all x and y. I don't know how to show for general values
of $N_x$ and $N_y$ that this extremum of $S(\rho)$ is a global minimum.



B Proof that $H(x : y) \leq E_F(\sqrt{P_{xy}})$

In this appendix, we will prove an inequality which gives an upper bound for the
classical mutual information $H(x : y)$. From $H(x : y) = H(x) - H(x|y)$ and $H(x|y) \geq 0$,
it follows that

$H(x : y) \leq \min\{H(x), H(y)\}$ . (B.1)

What we seek here is a tighter upper bound for $H(x : y)$.

Suppose x (ditto, y) is a random variable that can assume values in a set $S_x$
(ditto, $S_y$) which contains $N_x$ (ditto, $N_y$) elements. Let $P_{xy}$ be the joint probability
distribution of x and y. Let $P_{x-} = \sum_y P_{xy}$ and $P_{-y} = \sum_x P_{xy}$. Without loss of
generality, we will assume that $N_x \leq N_y$.

Define $\Psi$ to be the rectangular matrix with entries

$\Psi_{xy} = \sqrt{P_{xy}}$ . (B.2)

Note that

$tr(\Psi\Psi^T) = \sum_{x,y} \Psi_{xy}^2 = \sum_{x,y} P_{xy} = 1$ . (B.3)

Let

$\tilde{\Psi} = U \Psi V^T$ , (B.4)
where U and V are (real) orthogonal matrices. Define

$\tilde{P}_{xy} = \tilde{\Psi}_{xy}^2$ . (B.5)

Then

$\sum_{x,y} \tilde{P}_{xy} = tr(\tilde{\Psi}\tilde{\Psi}^T) = tr(\Psi\Psi^T) = 1$ . (B.6)

Define $\eta$ by

$\eta = \sum_{x,y} \tilde{P}_{xy} \ln \frac{\tilde{P}_{xy}}{\tilde{P}_{x-}\tilde{P}_{-y}}$ . (B.7)

Note that

$H(x : y) = \left. \frac{\eta}{\ln 2} \right|_{U=V=1}$ , (B.8)

where the right-hand side is evaluated at U = V = 1. Our goal is to show that: (1)
$\eta$ has a global maximum when it varies over the spaces of all orthogonal $N_x \times N_x$
matrices U and all orthogonal $N_y \times N_y$ matrices V; (2) the maximum occurs when U
and V make $\tilde{\Psi}$ diagonal. (Such a U and V exist by the Singular Value Decomposition
Theorem). When $\tilde{\Psi}$ is diagonal,

$\frac{\eta}{\ln 2} = \sum_x \tilde{P}_{xx} \log_2 \frac{1}{\tilde{P}_{xx}} = E_F(\tilde{\Psi}_{xy}) = E_F(\Psi_{xy}) = E_F(\sqrt{P_{xy}})$ . (B.9)

Therefore, if $\eta$ has a global maximum when $\tilde{\Psi}$ is diagonal, then

$H(x : y) \leq E_F(\sqrt{P_{xy}})$ . (B.10)
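Suppose we vary each $\tilde{P}_{xy}$ by $\delta\tilde{P}_{xy}$ in such a way that

$\sum_{x,y} \delta\tilde{P}_{xy} = 0$ . (B.11)

(And therefore also $\sum_x \delta\tilde{P}_{x-} = \sum_y \delta\tilde{P}_{-y} = 0$.) Then

$\delta\eta = \sum_{x,y} (\delta\tilde{P}_{xy}) \ln \frac{\tilde{P}_{xy}}{\tilde{P}_{x-}\tilde{P}_{-y}} + nil$ , (B.12)

where

$nil = \sum_{x,y} \tilde{P}_{xy} \left( \frac{\delta\tilde{P}_{xy}}{\tilde{P}_{xy}} - \frac{\delta\tilde{P}_{x-}}{\tilde{P}_{x-}} - \frac{\delta\tilde{P}_{-y}}{\tilde{P}_{-y}} \right)$ . (B.13)

Because of Eq.(B.11), nil = 0.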

U and V are orthogonal and we will vary them so that $U + \delta U$ and $V + \delta V$
are also orthogonal. Thus, $\sum_{x,y}(\tilde{P}_{xy} + \delta\tilde{P}_{xy}) = 1$. Thus, Eq.(B.11) is satisfied.
For $N_x = N_y = 2$, U and V can be parameterized by expressing them as

$U = \begin{bmatrix} C_1 & S_1 \\ -S_1 & C_1 \end{bmatrix}$ , $V = \begin{bmatrix} C_2 & S_2 \\ -S_2 & C_2 \end{bmatrix}$ , (B.14)

where $C_j = \cos\theta_j$, $S_j = \sin\theta_j$ for j = 1, 2. Then we can vary U and V by varying the
angles $\theta_1$, $\theta_2$. For general $N_x$ and $N_y$, we can express U and V as $U = e^{\alpha}$ and $V = e^{\beta}$,
where $\alpha$ and $\beta$ are antisymmetric matrices. Then we can vary U and V by varying
the components of $\alpha$ and $\beta$ that lie above their main diagonal.
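One gets

$\delta\tilde{P}_{xy} = 2\tilde{\Psi}_{xy}\delta\tilde{\Psi}_{xy}$ , (B.15)

and

$\delta\tilde{\Psi} = (\delta U)\Psi V^T + U\Psi(\delta V^T) = A\tilde{\Psi} + \tilde{\Psi}B$ , (B.16)

where

$A = (\delta U)U^T$ , $B = V\delta V^T$ . (B.17)

Because $UU^T = 1$, $(\delta U)U^T + U\delta U^T = 0$, which can be expressed in terms of A as
$A = -A^T$. Thus, A must be antisymmetric. B must be antisymmetric too.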

Next we will show that if U and V are such that $\tilde{\Psi}$ is diagonal, then $\delta\tilde{P}_{xy} = 0$
for all x and y, and therefore, by Eq.(B.12), $\delta\eta = 0$. Consider some x, y such that
$x \neq y$; for example, x = 0, y = 1. Since $\tilde{\Psi}_{01} = 0$, Eq.(B.15) implies $\delta\tilde{P}_{01} = 0$.
Consider some x, y such that x = y; for example, x = y = 0. $\sum_a A_{0a}\tilde{\Psi}_{a0} = 0$ because
when a = 0, $A_{00} = 0$, and when $a \neq 0$, $\tilde{\Psi}_{a0} = 0$. Likewise, $\sum_b \tilde{\Psi}_{0b}B_{b0} = 0$. Thus, by
Eq.(B.16), $\delta\tilde{\Psi}_{00} = 0$. Since $\delta\tilde{\Psi}_{00} = 0$, Eq.(B.15) implies $\delta\tilde{P}_{00} = 0$.

So far we have shown that $\delta\eta = 0$ when $\tilde{\Psi}$ is diagonal. It remains for us to show
that this extremum is a global maximum. I don't know how to show this. However,
my Monte Carlo tests support this claim. Furthermore, the following argument shows
that the extremum is at least a local maximum. One has

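$\delta^2\eta = \sum_{x,y} (\delta^2\tilde{P}_{xy}) \ln \frac{\tilde{P}_{xy}}{\tilde{P}_{x-}\tilde{P}_{-y}} + nil'$ , (B.18)

where

$nil' = \sum_{x,y} \frac{(\delta\tilde{P}_{xy})^2}{\tilde{P}_{xy}} - \sum_x \frac{(\delta\tilde{P}_{x-})^2}{\tilde{P}_{x-}} - \sum_y \frac{(\delta\tilde{P}_{-y})^2}{\tilde{P}_{-y}}$ . (B.19)

If $\tilde{\Psi}$ is diagonal, then $\delta\tilde{P}_{xy} = 0$ for all x and y so nil' = 0. One has

$\delta^2\tilde{P}_{xy} = \delta[2\tilde{\Psi}_{xy}\delta\tilde{\Psi}_{xy}] = 2(\delta\tilde{\Psi}_{xy})^2 + 2\tilde{\Psi}_{xy}\delta^2\tilde{\Psi}_{xy}$ . (B.20)
If $\tilde{\Psi}$ is diagonal, then $\delta^2\tilde{P}_{xy} = 2(\delta\tilde{\Psi}_{xy})^2 > 0$ for any $x \neq y$. But $\tilde{P}_{xy} = 0$ for $x \neq y$ so
$\delta^2\eta \rightarrow -\infty$. Thus $\eta$ has a local maximum when $\tilde{\Psi}$ is diagonal. In fact, $\eta$ has a cusp
there. The cusp is on the boundary of the region on which $\tilde{P}_{xy}$ is defined.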
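
The bound of Eq.(B.10) can also be probed directly by random sampling. The sketch below is not the Monte Carlo test mentioned above (which concerns the maximum of $\eta$ over U and V). It only compares the two sides of Eq.(B.10) for random joint distributions, assuming that $E_F$ of a real amplitude matrix is the entropy of its squared singular values. All function names are illustrative.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits; entries that are numerically zero are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 1e-15]
    return float(-(p * np.log2(p)).sum())

def mutual_information(p_xy):
    """Classical mutual information H(x : y) = H(x) + H(y) - H(x, y)."""
    return (entropy_bits(p_xy.sum(axis=1))
            + entropy_bits(p_xy.sum(axis=0))
            - entropy_bits(p_xy))

def ef_of_sqrt(p_xy):
    """Entropy of the squared singular values of Psi = sqrt(P), i.e. of Psi Psi^T."""
    s = np.linalg.svd(np.sqrt(p_xy), compute_uv=False)
    return entropy_bits(s ** 2)

rng = np.random.default_rng(2)
worst = np.inf
for _ in range(2000):
    nx, ny = rng.integers(2, 5, size=2)     # sizes between 2 and 4
    p = rng.random((nx, ny))
    p /= p.sum()                            # random joint distribution P_xy
    worst = min(worst, ef_of_sqrt(p) - mutual_information(p))

print(worst)
```

A non-negative `worst` (up to rounding) is consistent with Eq.(B.10) on the sampled distributions; it is numerical evidence only, not a proof of the global-maximum claim.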

C Entropy of Density Matrix 
with Repeated Index Pairs 

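Often in this paper we need to evaluate the entropy of a density matrix such as

$R = \sum_{a,a'} R_{a,a'} |\underline{a} = a, \underline{b} = a\rangle\langle\underline{a} = a', \underline{b} = a'|$ , (C.1)

where the nodes $\underline{a}$ and $\underline{b}$ have the same states ($S_{\underline{a}} = S_{\underline{b}}$). By an "index pair" of a
matrix M we mean the row and column indices of an entry of M. The index pair
(a, a') is repeated in R. Consider the smaller density matrix

$\rho = \sum_{a,a'} R_{a,a'} |a\rangle\langle a'|$ . (C.2)

Next we will show that $S(R) = S(\rho)$. Thus, for the purpose of evaluating its entropy,
one can replace the density matrix R by the smaller $\rho$. The proof consists of showing
that R and $\rho$ have the same non-zero eigenvalues. Indeed, suppose $|\phi\rangle \in \mathcal{H}_{\underline{a}}$ is an
eigenvector of $\rho$:

$\rho|\phi\rangle = \lambda|\phi\rangle$ . (C.3)

Then $|\Phi\rangle$ defined by

$|\Phi\rangle = \sum_{a'} |\underline{a} = a', \underline{b} = a'\rangle\langle\underline{a} = a'|\phi\rangle$ (C.4)

is an eigenvector of R with the same eigenvalue $\lambda$. Indeed,

$R|\Phi\rangle = \sum_{a,a'} R_{a,a'} |\underline{a} = a, \underline{b} = a\rangle\langle\underline{a} = a'|\phi\rangle = \lambda|\Phi\rangle$ . (C.5)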

Thus, the set of eigenvalues of R contains the set of eigenvalues of p. From the matrix 
representation of R, it is clear that any eigenvalue of R which is not an eigenvalue of 
p must be zero. 
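
The equality of the two entropies is easy to confirm numerically. The sketch below builds a random density matrix, embeds it in the doubled space with repeated index pairs, and compares the entropies. The function names, the dimension and the random input are assumptions made for illustration.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Entropy in bits of a Hermitian density matrix; zero eigenvalues are skipped."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log2(lam)).sum())

def random_density_matrix(rng, n):
    """Random positive matrix with unit trace."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def embed_repeated(rho):
    """Matrix with entries rho[a, a'] placed at (|a, a>, <a', a'|) in the n*n-dimensional space."""
    n = rho.shape[0]
    idx = np.arange(n) * n + np.arange(n)   # flat index of the basis state |a, a>
    big = np.zeros((n * n, n * n), dtype=complex)
    big[np.ix_(idx, idx)] = rho
    return big

rng = np.random.default_rng(3)
rho_small = random_density_matrix(rng, 3)
R_big = embed_repeated(rho_small)

s_small = von_neumann_entropy(rho_small)
s_big = von_neumann_entropy(R_big)
print(s_small, s_big)
```

The extra eigenvalues of the larger matrix are all zero, so they drop out of the entropy and the two printed values agree up to rounding.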
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